Opportunities for Adaptive Experiments to Enable Continuous Improvement that Trades-off Instructor and Researcher Incentives
Randomized experimental comparisons of alternative pedagogical strategies could provide useful empirical evidence for instructors' decision-making. However, traditional experiments do not offer a clear and simple pathway to rapidly using data to increase the chances that students in an experiment receive the best conditions. Drawing inspiration from the use of machine learning and experimentation in product development at leading technology companies, we explore how adaptive experimentation might help in continuous course improvement. In adaptive experiments, as different arms/conditions are deployed to students, data are analyzed and used to change the experience for future students. Machine learning algorithms can identify which conditions are more promising for improving student experience or outcomes, and then dynamically deploy the most effective ones to future students, resulting in better support for students' needs. We illustrate the approach with a case study providing a side-by-side comparison of traditional and adaptive experimentation with self-explanation prompts in online homework problems in a CS1 course. This provides a first step in exploring how this methodology can bridge research and practice in pursuing continuous improvement.
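To make the adaptive mechanism concrete, here is a minimal sketch of one common algorithm for this setting, Beta-Bernoulli Thompson Sampling, assigning students between two hypothetical prompt conditions. The condition names, the simulated success probabilities, and the binary reward are illustrative assumptions, not the study's actual setup.

    import numpy as np

    # Beta-Bernoulli Thompson Sampling over two prompt conditions (arms).
    # Rewards are binary outcomes, e.g., whether the homework problem was solved.
    rng = np.random.default_rng(0)
    arms = ["no_prompt", "self_explanation_prompt"]  # hypothetical conditions
    true_p = [0.55, 0.65]           # unknown in practice; used only to simulate
    successes = np.ones(len(arms))  # Beta(1, 1) uniform priors
    failures = np.ones(len(arms))

    for student in range(500):
        # Sample a plausible success rate per arm, then deploy the best draw.
        draws = rng.beta(successes, failures)
        arm = int(np.argmax(draws))
        reward = int(rng.random() < true_p[arm])  # stand-in for observed outcome
        successes[arm] += reward
        failures[arm] += 1 - reward

    for a, s, f in zip(arms, successes, failures):
        print(a, f"{int(s - 1)} successes in {int(s + f - 2)} assignments")

Because more promising arms accumulate posterior mass, later students are increasingly routed to the better-performing condition while some exploration continues.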
Impact of Guidance and Interaction Strategies for LLM Use on Learner Performance and Perception
Personalized chatbot-based teaching assistants can be crucial in addressing
increasing classroom sizes, especially where direct teacher presence is
limited. Large language models (LLMs) offer a promising avenue, with increasing
research exploring their educational utility. However, the challenge lies not
only in establishing the efficacy of LLMs but also in discerning the nuances of
interaction between learners and these models, which impact learners'
engagement and results. We conducted a formative study in an undergraduate
computer science classroom (N=145) and a controlled experiment on Prolific
(N=356) to explore the impact of four pedagogically informed guidance
strategies and the interaction between student approaches and LLM responses.
Direct LLM answers marginally improved performance, while refining student solutions fostered trust. Our findings suggest a nuanced relationship between the guidance provided and the LLM's role in either answering or refining student input. Based on these findings, we provide design recommendations for optimizing learner-LLM interactions.
Exploring The Design of Prompts For Applying GPT-3 based Chatbots: A Mental Wellbeing Case Study on Mechanical Turk
Large language models like GPT-3 have the potential to enable HCI designers
and researchers to create more human-like and helpful chatbots for specific
applications. But evaluating the feasibility of these chatbots and designing
prompts that optimize GPT-3 for a specific task is challenging. We present a
case study in tackling these questions, applying GPT-3 to build a brief, 5-minute chatbot that anyone can talk to in order to better manage their mood. We report a
randomized factorial experiment with 945 participants on Mechanical Turk that
tests three dimensions of prompt design to initialize the chatbot (identity,
intent, and behaviour), and present both quantitative and qualitative analyses
of conversations and user perceptions of the chatbot. We hope other HCI designers and researchers can build on this case study when applying GPT-3-based chatbots to other specific tasks, and can build on and extend the methods we use for prompt design and evaluation.
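As an illustration of how such a factorial prompt design can be assembled, the sketch below crosses two placeholder levels of each of the three dimensions (identity, intent, and behaviour) into distinct chatbot initialization prompts. The wordings and the two-level design are assumptions for illustration, not the study's actual conditions.

    from itertools import product

    # Cross placeholder levels of the three prompt dimensions into a
    # factorial grid of chatbot initialization prompts (2 x 2 x 2 = 8 cells).
    identities = ["You are Ana, a friendly peer.",
                  "You are a professional counselling chatbot."]
    intents = ["Your goal is to help the user manage their mood.",
               ""]  # empty string = dimension left unspecified
    behaviours = ["Ask open-ended questions and respond with empathy.",
                  ""]

    conditions = [" ".join(part for part in (i, n, b) if part)
                  for i, n, b in product(identities, intents, behaviours)]
    for k, prompt in enumerate(conditions):
        print(f"condition {k}: {prompt}")

Each participant would then be randomly assigned one of these cells, allowing main effects and interactions of the three dimensions to be estimated.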
ABScribe: Rapid Exploration of Multiple Writing Variations in Human-AI Co-Writing Tasks using Large Language Models
Exploring alternative ideas by rewriting text is integral to the writing
process. State-of-the-art large language models (LLMs) can simplify writing
variation generation. However, current interfaces pose challenges for
simultaneous consideration of multiple variations: creating new versions
without overwriting text can be difficult, and pasting them sequentially can
clutter documents, increasing workload and disrupting writers' flow. To tackle
this, we present ABScribe, an interface that supports rapid, yet visually
structured, exploration of writing variations in human-AI co-writing tasks.
With ABScribe, users can swiftly produce multiple variations using LLM prompts,
which are auto-converted into reusable buttons. Variations are stored
adjacently within text segments for rapid in-place comparisons using mouse-over
interactions on a context toolbar. Our user study with 12 writers shows that ABScribe significantly reduces task workload (d = 1.20, p < 0.001) and enhances user perceptions of the revision process (d = 2.41, p < 0.001) compared to a popular baseline workflow, and it provides insights into how writers explore variations using LLMs.
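One way to picture the in-place variation storage described above is a segment-level data model in which each text segment keeps its alternatives side by side rather than overwriting them. The sketch below is an illustrative assumption, not ABScribe's actual implementation.

    from dataclasses import dataclass, field

    # Segment-level storage: each text span keeps all of its variations
    # adjacently, and 'active' selects which one renders in the document.
    @dataclass
    class Segment:
        variations: list[str] = field(default_factory=list)
        active: int = 0

        def add_variation(self, text: str) -> None:
            self.variations.append(text)

        def render(self) -> str:
            return self.variations[self.active]

    doc = [Segment(["The study was large."]), Segment(["Results were clear."])]
    doc[0].add_variation("The study enrolled twelve writers.")  # e.g., an LLM rewrite
    doc[0].active = 1
    print(" ".join(seg.render() for seg in doc))

Because alternatives live inside the segment rather than as pasted copies, switching or comparing them is a local lookup instead of a document-wide edit.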
Using Adaptive Bandit Experiments to Increase and Investigate Engagement in Mental Health
Digital mental health (DMH) interventions, such as text-message-based lessons
and activities, offer immense potential for accessible mental health support.
While these interventions can be effective, real-world experimental testing can
further enhance their design and impact. Adaptive experimentation, utilizing
algorithms like Thompson Sampling for (contextual) multi-armed bandit (MAB)
problems, can lead to continuous improvement and personalization. However, it
remains unclear when these algorithms can simultaneously increase user
experience rewards and facilitate appropriate data collection for
social-behavioral scientists to analyze with sufficient statistical confidence.
Although a growing body of research addresses the practical and statistical
aspects of MAB and other adaptive algorithms, further exploration is needed to
assess their impact across diverse real-world contexts. This paper presents a
software system developed over two years that allows text-messaging
intervention components to be adapted using bandit and other algorithms while
collecting data for side-by-side comparison with traditional uniform random
non-adaptive experiments. We evaluate the system by deploying a
text-message-based DMH intervention to 1100 users, recruited through a large
mental health non-profit organization, and share the path forward for deploying it at scale. The system not only enables applications in mental health but could also serve as a model testbed for adaptive experimentation algorithms in other domains.
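The side-by-side design mentioned above can be sketched as a simple assignment policy: half of incoming users receive a uniform random (traditional) arm, and half receive an arm chosen by Beta-Bernoulli Thompson Sampling whose posteriors update only from the adaptive half. All details below are illustrative assumptions, not the deployed system's implementation.

    import numpy as np

    rng = np.random.default_rng(1)
    n_arms = 2
    s = np.ones(n_arms)  # Beta(1, 1) posteriors for the adaptive half
    f = np.ones(n_arms)

    def assign(user_id: int) -> tuple[str, int]:
        if user_id % 2 == 0:                 # traditional: uniform random arm
            return "uniform", int(rng.integers(n_arms))
        return "adaptive", int(np.argmax(rng.beta(s, f)))  # Thompson Sampling

    def record(policy: str, arm: int, reward: int) -> None:
        if policy == "adaptive":             # only the bandit half updates
            s[arm] += reward
            f[arm] += 1 - reward

    for uid in range(4):
        policy, arm = assign(uid)
        record(policy, arm, reward=1)        # reward would come from the user

The uniform half preserves a conventional randomized experiment for statistical analysis, while the adaptive half lets the algorithm steer users toward better-performing intervention components.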
Prototyping Text Mining and Network Analysis Tools to Support Netnographic Student Projects
Social science is witnessing tremendous growth in the data available on the Internet regarding social phenomena; however, social science students are typically not prepared to manage the challenges and opportunities of analysing online data. One of the areas where this growth is especially important is in social studies of consumption. This article discusses a prototype of a visualisation tool intended to support the learning of netnographic analysis with computational tools.
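As a toy illustration of the two kinds of computational support such a tool combines, the sketch below computes word frequencies and builds a reply network from a few invented forum posts; the data and the choice of networkx are assumptions for illustration only.

    from collections import Counter
    import networkx as nx

    # Invented (author, reply-to, text) posts from an online community.
    posts = [
        ("ana", "ben", "love this brand of coffee"),
        ("ben", "ana", "the coffee is fair trade"),
        ("cem", "ana", "which brand of coffee"),
    ]

    # Text mining: simple term frequencies across post bodies.
    terms = Counter(word for _, _, text in posts for word in text.split())
    print(terms.most_common(3))

    # Network analysis: a directed reply graph between community members.
    g = nx.DiGraph()
    g.add_edges_from((src, dst) for src, dst, _ in posts)
    print(nx.degree_centrality(g))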
Predictors of Academic Achievement in Blended Learning: the Case of Data Science Minor
This paper is dedicated to studying patterns of learning behavior in connection with educational achievement in a multi-year undergraduate Data Science minor specialization for non-STEM students. We focus on analyzing predictors of academic achievement in blended learning, taking into account factors related to initial mathematics knowledge, specific traits of educational programs, online and offline learning engagement, and connections with peers. Robust linear regression and non-parametric statistical tests reveal a significant gap in achievement between students from different educational programs. Achievement is not related to communication on the Q&A forum, while peers do have an effect on academic success: performing better than nominated friends, as well as having friends among Teaching Assistants, boosts academic achievement.
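To indicate the kind of model behind such results, here is a sketch of a robust linear regression on simulated data, using Huber weighting as implemented in statsmodels; the predictors, effect sizes, and data are illustrative assumptions, not the paper's dataset.

    import numpy as np
    import statsmodels.api as sm

    # Simulated stand-ins for the paper's predictors of achievement.
    rng = np.random.default_rng(2)
    n = 200
    math_pretest = rng.normal(size=n)        # initial mathematics knowledge
    online_activity = rng.normal(size=n)     # online learning engagement
    ta_friend = rng.integers(0, 2, size=n)   # has a friend among TAs (0/1)
    score = (0.6 * math_pretest + 0.2 * online_activity
             + 0.3 * ta_friend + rng.normal(scale=0.5, size=n))

    # Huber-weighted robust fit: downweights outlying students instead of
    # letting them dominate the least-squares solution.
    X = sm.add_constant(np.column_stack([math_pretest, online_activity, ta_friend]))
    fit = sm.RLM(score, X, M=sm.robust.norms.HuberT()).fit()
    print(fit.params)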